Filtering and Classifying Relevant Short Text with a Few Seed Words
Similar resources
Hierarchically Classifying Documents Using Very Few Words
The proliferation of topic hierarchies for text documents has resulted in a need for tools that automatically classify new documents within such hierarchies. One can use existing classifiers by ignoring the hierarchical structure, treating the topics as separate classes. Unfortunately, in the context of text categorization, we are faced with a large number of classes and a huge number of relevan...
Hierarchically Classifying Documents Using Very Few Words
The proliferation of topic hierarchies for text documents has resulted in a need for tools that automatically classify new documents within such hierarchies. Existing classification schemes, which ignore the hierarchical structure and treat the topics as separate classes, are often inadequate in text classification, where there is a large number of classes and a huge number of relevant features ...
Exploring text datasets by visualizing relevant words
When working with a new dataset, it is important to first explore and familiarize oneself with it, before applying any advanced machine learning algorithms. However, to the best of our knowledge, no tools exist that quickly and reliably give insight into the contents of a selection of documents with respect to what distinguishes them from other documents belonging to different categories. In th...
OKSAT at NTCIR-12 Short Text Conversation Task: Priority to Short Comments, Filtering by Characteristic Words and Topic Classification
Our group OKSAT submitted five runs for Chinese and Japanese subtasks of the NTCIR-12 Short Text Conversation task (STC). We searched not only posts but also comments for terms of each query (post). We also gave more priority to short comments than longer ones. Then we filtered retrieved comments by characteristic words including proper nouns. We added attributes to the corpus and also to the q...
Binary Words with Few Squares
A short proof is given for a result of Fraenkel and Simpson [Electronic J. Combinatorics 2 (1995), 159–164] stating that there exists an infinite binary word which has only three different squares u^2.
Journal
Journal title: Data and Information Management
Year: 2019
ISSN: 2543-9251
DOI: 10.2478/dim-2019-0011